Bilingual Insights into the Initial Lexicon

The Role of Cognates in Word Acquisition

Gonzalo Garcia-Castro

PhD Defence / Departament de Medicina i Ciències de la Vida

2024-11-03

The initial lexicon

Average 20-year-old knows ~42,000 lemmas: mental lexicon

Lexical representations
Phonological, conceptual, grammatical information of known words
Form-meaning association

First lexical representations at 6-9 months

Normative trajectories of lexical development

Figure 1: Vocabulary size norms for 51,800 monolingual children learning 35 distinct languages (wordbank)

Bilinguals face additional challenges, but do not lag behind

  • Increased complexity in linguistic context (learning two codes)
  • Reduced linguistic input (split into two languages)
  • Increased referential ambiguity


Hoff et al. (2012): bilinguals acquire words at rates similar to monolinguals

47 English-Spanish bilinguals, 56 English monolinguals in Florida

Lexical similarity modulates vocabulary acquisition in bilinguals

Floccia et al. (2018): CDI responses from 372 bilinguals (UK) learning English and an additional language

Lexical similarity:
Average phonological similarity (Levenshtein similarity) between pairs of translations

English-Dutch (22.14%) > English-Mandarin (1.97%)
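A minimal sketch of how an average Levenshtein similarity across translation pairs could be computed. The normalisation (one minus the edit distance divided by the length of the longer form) is an assumption; the slides do not state which normalisation was used. The function names are hypothetical, and the example pairs are simplified ASCII stand-ins for phonological transcriptions, included for illustration only.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-symbol insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Distance rescaled to [0, 1]; 1 means identical forms (assumed normalisation)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def lexical_similarity(pairs) -> float:
    """Average similarity across a collection of translation pairs."""
    return sum(levenshtein_similarity(a, b) for a, b in pairs) / len(pairs)


# Illustrative pairs only: simplified transcriptions, not data from the study.
pairs = [("gat", "gato"), ("gos", "pero")]
print(lexical_similarity(pairs))
```

A real analysis would operate on phonological transcriptions segment by segment rather than on ASCII strings.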

Higher lexical similarity, larger vocabulary size

Stronger effect in the additional language (e.g., Dutch, Mandarin)

Lexical similarity modulates vocabulary acquisition in bilinguals

Figure 2: Pairwise lexical similarity (average Levenshtein similarity across translations).

A cognate facilitation in lexical acquisition?

Cognates: phonologically-similar translation equivalents

Cognate: [cat] /ˈgat-ˈga.to/
Non-cognate: [dog] /ˈgos-ˈpe.ro/

Some evidence that cognates are acquired earlier than non-cognates (Mitchell, Tsui, and Byers-Heinlein 2023; Bosch and Ramon-Casas 2014)


What mechanisms support a cognate facilitation during word acquisition?

Language non-selective lexical access

Activation spreads across non-selected representations in both languages, through phonological and conceptual links (e.g., Costa, Caramazza, and Sebastian-Galles 2000).

Evidence in children (Bosma and Nota 2020; De Houwer, Bornstein, and Putnick 2014) and infants (Von Holzen and Mani 2012; Jardak and Byers-Heinlein 2019; Singh 2014).

The present dissertation

Study 1

  1. Provide a mechanistic account for the cognateness facilitation
  2. Test predictions of the model

Study 2

  1. Test core assumption of the model: language non-selectivity in the initial lexicon

Study 1

Cognate beginnings to lexical acquisition: the AMBLA model

Accumulator Model of Bilingual Lexical Acquisition (AMBLA)

  1. Information about form-meaning mappings is provided by learning instances

Exposure to a word-form that results in the accumulation of information about its meaning

  2. Age of acquisition: the infant accumulates a threshold amount of learning instances for a word-form

\[ \begin{aligned} \definecolor{myred}{RGB}{ 168, 0, 53 } \definecolor{myblue}{RGB}{ 0, 64, 168 } \definecolor{mygreen}{RGB}{0, 168, 87} \definecolor{grey}{RGB}{128, 128, 128} \textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\ {\color{mygreen}\text{Age of Acquisition}_{ij}} &= \{\text{Age}_i \mid {\color{myred}\text{Learning instances}_{ij}} = {\color{myblue}\text{Threshold}} \}\\ \color{myred}{\text{Learning instances}_{ij}} &= \text{Age}_i \cdot \text{Freq}_j \\ \textbf{where:} \\ {\color{myblue}\text{Threshold}} &= 300 \\ \text{Freq}_j &\sim \text{Poisson}(\lambda = 50) \end{aligned} \]
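The monolingual case above can be sketched as a short simulation. This is an illustration under stated assumptions, not the dissertation's implementation: age is taken to be in months, Freq is read as learning instances per month, and the helper names are hypothetical. Since learning instances grow as Age times Freq, the age of acquisition is simply Threshold divided by Freq.

```python
import math
import random

THRESHOLD = 300  # learning instances needed for acquisition (from the slide)
MEAN_FREQ = 50   # Poisson rate for word frequency (from the slide)


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw from Poisson(lam) with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def age_of_acquisition(freq: float, threshold: float = THRESHOLD) -> float:
    """Age at which Age * Freq reaches the threshold (assumed unit: months)."""
    if freq <= 0:
        return math.inf  # a word-form that is never heard is never acquired
    return threshold / freq


rng = random.Random(1)
frequencies = [sample_poisson(MEAN_FREQ, rng) for _ in range(5)]
for freq in frequencies:
    print(freq, round(age_of_acquisition(freq), 2))
```

A word-form with exactly the mean frequency of 50 reaches the threshold at 300 / 50 = 6 in these units; more frequent word-forms are acquired earlier.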

AMBLA: monolingual word acquisition

AMBLA: bilingual word acquisition

Catalan 60%, Spanish 40%

  1. Linguistic input divided into two languages

Exposure: proportion of time exposed to the language of word-form \(j\)

Accumulation of learning instances, a function of Exposure and Frequency.

\[ \begin{aligned} \definecolor{myred}{RGB}{ 168, 0, 53 } \definecolor{myblue}{RGB}{ 0, 64, 168 } \definecolor{mygreen}{RGB}{0, 168, 87} \definecolor{myorange}{RGB}{ 235, 127, 26 } \textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\ \text{Age of Acquisition}_{ij} &= \{\text{Age}_i \mid \text{Learning instances}_{ij} = \text{Threshold} \}\\ \text{Learning instances}_{ij} &= \text{Age}_i \cdot \text{Freq}_j \cdot {\color{myred}\text{Exposure}_{ij}}\\ \textbf{where:} \\ \text{Threshold} &= 300 \\ \text{Freq}_j &\sim \text{Poisson}(\lambda = 50) \end{aligned} \]
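A small worked sketch of the bilingual case, using the 60% Catalan and 40% Spanish split from the slide. As before, the unit of age (months) and the reading of Freq as learning instances per month are assumptions, and the function name is hypothetical. Exposure scales the rate of accumulation, so the same word-form frequency yields a later age of acquisition in the lower-exposure language.

```python
THRESHOLD = 300  # learning instances needed for acquisition (from the slide)


def age_of_acquisition(freq: float, exposure: float,
                       threshold: float = THRESHOLD) -> float:
    """Age at which Age * Freq * Exposure reaches the threshold."""
    rate = freq * exposure
    if rate <= 0:
        return float("inf")  # no exposure to the language, no acquisition
    return threshold / rate


exposures = {"Catalan": 0.6, "Spanish": 0.4}
ages = {
    language: age_of_acquisition(freq=50, exposure=exposure)
    for language, exposure in exposures.items()
}
for language, age in ages.items():
    print(language, round(age, 2))
```

With Freq fixed at 50, the Catalan word-form reaches the threshold at about 10 and the Spanish one at about 15, compared with 6 for a monolingual receiving all input in one language.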

AMBLA: cognate facilitation

  1. Words may accumulate additional learning instances from the co-activation of their (phonologically similar) translation equivalent

Degree proportional to their phonological similarity (Cognateness)

\[ \begin{aligned} \definecolor{myred}{RGB}{ 168, 0, 53 } \definecolor{myblue}{RGB}{ 0, 64, 168 } \definecolor{mygreen}{RGB}{0, 168, 87} \definecolor{myorange}{RGB}{ 235, 127, 26 } \textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\ \text{Age of Acquisition}_{ij} &= \{\text{Age}_i \mid \text{Learning instances}_{ij} = \text{Threshold} \}\\ \text{Learning instances}_{ij} &= \text{Age}_i \cdot \text{Freq}_j \cdot \text{Exposure}_{ij} + \\ &({\color{myred}\text{Learning instances}_{ij'} \cdot {\text{Cognateness}}_{j}})\\ \textbf{where:} \\ \text{Threshold} &= 300 \\ \text{Freq}_j &\sim \text{Poisson}(\lambda = 50) \\ {\color{myred}\text{Cognateness}_{j}}&{\color{myred} = \text{Levenshtein}(j, j')} \end{aligned} \]
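One way to illustrate the cognate term is the sketch below. It rests on a simplifying assumption that is not stated in the slides: the translation equivalent contributes only its directly accumulated instances (Age times Freq times Exposure in the other language), scaled by cognateness, which avoids the mutual dependence between the two word-forms. All names are hypothetical, both word-forms share a frequency of 50, and cognateness is treated as a similarity between 0 and 1.

```python
THRESHOLD = 300  # learning instances needed for acquisition (from the slide)
FREQ = 50        # same frequency for both word-forms, for simplicity
EXPOSURE = {"Catalan": 0.6, "Spanish": 0.4}


def acquisition_age(freq, exposure, freq_tr, exposure_tr, cognateness,
                    threshold=THRESHOLD):
    """Age at which own plus borrowed learning instances reach the threshold.

    Assumption: the translation contributes only its direct instances
    (Age * Freq * Exposure), scaled by cognateness.
    """
    own_rate = freq * exposure
    borrowed_rate = cognateness * freq_tr * exposure_tr
    rate = own_rate + borrowed_rate
    return threshold / rate if rate > 0 else float("inf")


def facilitation(language, other, cognateness):
    """Reduction in acquisition age relative to a non-cognate (cognateness 0)."""
    baseline = acquisition_age(FREQ, EXPOSURE[language], FREQ, EXPOSURE[other], 0.0)
    boosted = acquisition_age(FREQ, EXPOSURE[language], FREQ, EXPOSURE[other],
                              cognateness)
    return baseline - boosted


gain_catalan = facilitation("Catalan", "Spanish", cognateness=0.75)
gain_spanish = facilitation("Spanish", "Catalan", cognateness=0.75)
print(round(gain_catalan, 2), round(gain_spanish, 2))
```

Under these assumptions the gain is larger for the Spanish word-form, that is, for the lower-exposure language, because the translation in the higher-exposure language supplies more borrowed instances.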

Predictions

  1. Cognates acquired earlier than non-cognates
  2. Cognateness facilitation stronger in the lower-exposure language

Dataset

  • Barcelona Vocabulary Questionnaire (BVQ): 302 Catalan-Spanish noun translation equivalents
  • 436 administrations (366 children)
  • Age: 12-32 months (M = 22.2, SD = 4.9)
  • Lang. exposure: Catalan and Spanish (\(\leq\) 10% 3rd language)
  • 138,078 item responses (No, Understands, Understands and Says)

Modelling and statistical inference

p(Comprehension \(<\) Production) ~ Ordinal, multilevel (Bayesian) regression model

response ~ age * exposure * cognateness + length + (...|child) + (...|te) 

\[ \begin{aligned} \text{Exposure}_{ij} &= \text{Frequency}_j \times \text{Language degree of exposure}_{ij} \\ \text{Cognateness}_{j} &= \text{Levenshtein}(j, j') \end{aligned} \]
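To make the ordinal response scale concrete, the sketch below shows how a cumulative ordinal model maps a linear predictor onto probabilities for the three response categories. The logit link, the cutpoint values, and the linear predictor values are assumptions chosen for illustration; the slides do not report them, and the function names are hypothetical.

```python
import math

CATEGORIES = ("No", "Understands", "Understands and Says")


def logistic(x: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def ordinal_probabilities(eta: float, cutpoints) -> list:
    """Category probabilities under a cumulative-logit model.

    P(response <= k) = logistic(cutpoint_k - eta); category probabilities
    are differences between successive cumulative probabilities.
    """
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probabilities, lower = [], 0.0
    for upper in cumulative:
        probabilities.append(upper - lower)
        lower = upper
    return probabilities


cutpoints = (-1.0, 1.0)  # hypothetical ordered thresholds
for eta in (-2.0, 0.0, 2.0):  # hypothetical linear predictor values
    probs = ordinal_probabilities(eta, cutpoints)
    print(eta, [round(p, 3) for p in probs])
```

A larger linear predictor (for example, an older child or a more frequent word) shifts probability mass from "No" towards "Understands and Says".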

Results

Figure 3: Marginal posterior predictions

Discussion

Earlier acquisition for cognates vs. non-cognates

Cognate facilitation moderated by exposure

Only words from the lower-exposure language benefit from cognateness

Parallel to language dominance effects in adults?

Cognateness as a candidate mechanism underlying Floccia et al. (2018)’s results

Cross-language facilitation via co-activation of phonologically similar translation equivalents

Is language non-selectivity already present?

Study 2

Developmental trajectories of bilingual spoken word recognition

Language non-selectivity in the initial lexicon

Priming through translation task:

Von Holzen and Mani (2012): German-English bilinguals (N = 20, 21-43 months)

Implicit naming task:

Mani and Plunkett (2010): English monolinguals

Mani and Plunkett (2010)

Experiment 1: predictions

If language non-selectivity, stronger interference in cognate vs. non-cognate trials

Experiment 1: participants

Replication study

N = 112 children (15 longitudinal)

Aged 26.36 months (SD = 4.01, Range = 20.03–32.5)

English monolinguals, Oxford (United Kingdom) (as in Mani and Plunkett 2010)

Proportion of target looking (PTLT) ~ (Bayesian) GAMMs
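A minimal sketch of how a proportion of target looking could be computed from gaze samples in a single trial window. The definition used here (target samples divided by target plus distractor samples, ignoring looks elsewhere) is an assumption; the slides do not specify the denominator or the time window, and the sample labels and function name are hypothetical.

```python
from collections import Counter


def proportion_target_looking(samples):
    """Share of on-picture gaze samples that fall on the target.

    `samples` is a sequence of labels such as "target", "distractor",
    or "away". Returns None when the child looked at neither picture.
    """
    counts = Counter(samples)
    on_pictures = counts["target"] + counts["distractor"]
    if on_pictures == 0:
        return None
    return counts["target"] / on_pictures


# Hypothetical gaze samples from one trial window.
trial = ["target", "target", "away", "distractor", "target", "away"]
print(proportion_target_looking(trial))
```

Values above 0.5 indicate a preference for the target picture; the time course of this proportion is what the GAMMs would then model.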

Experiment 1: results (monolinguals)

Figure 4: Time course of target fixations in Experiment 1.

Experiment 2: participants

N = 162 children (81 longitudinal)

  • 107 sessions in monolinguals (>= 80% exposure to Catalan/Spanish)
  • 135 sessions in bilinguals

Exposed to Catalan or Spanish (Metropolitan Area of Barcelona, Spain)

Aged 25.36 months (SD = 4.01, Range = 20.03–32.5)

Experiment 2: results (monolinguals)

Figure 5: Time course of target fixations in Experiment 2.

Experiment 2: results (bilinguals)

Figure 6: Time course of target fixations in Experiment 2.

Discussion

Successful word recognition across:

  • Ages
  • Language profiles
  • Vocabulary sizes

No evidence of priming effects, within or across languages

Most likely due to design issues

General discussion

Summary

Cognateness facilitates word acquisition in the lower-exposure language

Candidate mechanism behind bilingual vocabulary growth

AMBLA: Cross-language accumulation of learning instances

Language non-selectivity in the initial lexicon: pending severe testing

Towards a model of bilingual lexical acquisition

Explanation for Floccia et al. (2018)

Asymmetry in adult models of lexical processing

AMBLA: natural extension of the Standard Model of language acquisition? (Kachergis, Marchman, and Frank 2022)

Limitations

Design caveats

Generalisability? Language pairs with fewer cognates

Does cognateness impact the acquisition of other grammatical categories (e.g., verbs, adjectives)?

Word acquisition vs. word learning

Conclusions

  • Insights into the developing bilingual lexicon: cognateness
  • Evidence in favour of language non-selectivity as the mechanism underlying the cognate facilitation
  • Important consequences for bilingual vocabulary growth

Thanks

Appendix

Study 1: posterior regression coefficients

Figure 7: Aggregated vocabularies might conceal facilitation effects

Experiment 1: predictions

  • Successful spoken word recognition across groups
  • If language non-selectivity, stronger interference in cognate vs. non-cognate trials

Experiment 1: vocabulary Oxford

Figure 8: Participant receptive vocabulary sizes across ages and language profiles.

Experiment 2: vocabulary Barcelona

Figure 9: Participant receptive vocabulary sizes across ages and language profiles.

References

Bergelson, Elika, and Daniel Swingley. 2012. “At 6–9 Months, Human Infants Know the Meanings of Many Common Nouns.” Proceedings of the National Academy of Sciences 109 (9): 3253–58. https://doi.org/10.1073/pnas.1113380109.
Bosch, Laura, and Marta Ramon-Casas. 2014. “First Translation Equivalents in Bilingual Toddlers’ Expressive Vocabulary: Does Form Similarity Matter?” International Journal of Behavioral Development 38 (4): 317–22. https://doi.org/10.1177/0165025414532559.
Bosma, Evelyn, and Naomi Nota. 2020. “Cognate Facilitation in Frisian–Dutch Bilingual Children’s Sentence Reading: An Eye-Tracking Study.” Journal of Experimental Child Psychology 189: 104699. https://doi.org/10.1016/j.jecp.2019.104699.
Costa, Albert, Alfonso Caramazza, and Nuria Sebastian-Galles. 2000. “The Cognate Facilitation Effect: Implications for Models of Lexical Access.” Journal of Experimental Psychology: Learning, Memory, and Cognition 26 (5): 1283. https://doi.org/10.1037/0278-7393.26.5.1283.
De Houwer, Annick, Marc H Bornstein, and Diane L Putnick. 2014. “A Bilingual–Monolingual Comparison of Young Children’s Vocabulary Size: Evidence from Comprehension and Production.” Applied Psycholinguistics 35 (6): 1189–1211. https://doi.org/10.1017/s0142716412000744.
Fenson, Larry, Philip S Dale, J Steven Reznick, Elizabeth Bates, Donna J Thal, Stephen J Pethick, Michael Tomasello, Carolyn B Mervis, and Joan Stiles. 1994. “Variability in Early Communicative Development.” Monographs of the Society for Research in Child Development 59 (5): 1–185. https://doi.org/10.2307/1166093.
Floccia, Caroline, Thomas D. Sambrook, Claire Delle Luche, Rosa Kwok, Jeremy Goslin, Laurence White, Allegra Cattani, et al. 2018. “I: Introduction.” Monographs of the Society for Research in Child Development 83 (1): 7–29. https://doi.org/10.1111/mono.12348.
Hoff, Erika, Cynthia Core, Silvia Place, Rosario Rumiche, Melissa Señor, and Marisol Parra. 2012. “Dual Language Exposure and Early Bilingual Development.” Journal of Child Language 39 (1): 1–27. https://doi.org/10.1017/s0305000910000759.
Jardak, Amel, and Krista Byers-Heinlein. 2019. “Labels or Concepts? The Development of Semantic Networks in Bilingual Two-Year-Olds.” Child Development 90 (2): e212–29. https://doi.org/10.1111/cdev.13050.
Kachergis, George, Virginia A. Marchman, and Michael C. Frank. 2022. “Toward a ‘Standard Model’ of Early Language Learning.” Current Directions in Psychological Science 31 (1): 20–27. https://doi.org/10.1177/09637214211057836.
Mani, Nivedita, and Kim Plunkett. 2010. “In the Infant’s Mind’s Ear: Evidence for Implicit Naming in 18-Month-Olds.” Psychological Science 21 (7): 908–13. https://doi.org/10.1177/0956797610373371.
Mitchell, Lori, Rachel Ka-Ying Tsui, and Krista Byers-Heinlein. 2023. “Cognates Are Advantaged over Non-Cognates in Early Bilingual Expressive Vocabulary Development.” Journal of Child Language, 1–20.
Singh, Leher. 2014. “One World, Two Languages: Cross-Language Semantic Priming in Bilingual Toddlers.” Child Development 85 (2): 755–66. https://doi.org/10.1111/cdev.12133.
Tincoff, Ruth, and Peter W Jusczyk. 1999. “Some Beginnings of Word Comprehension in 6-Month-Olds.” Psychological Science 10 (2): 172–75. https://doi.org/10.1111/1467-9280.00127.
Von Holzen, Katie, and Nivedita Mani. 2012. “Language Nonselective Lexical Access in Bilingual Toddlers.” Journal of Experimental Child Psychology 113 (4): 569–86. https://doi.org/10.1016/j.jecp.2012.08.001.